FinTech Fraud Detection — This notebook focuses on the Credit Card Fraud dataset


Step 1: Exploratory Data Analysis (EDA)¶

In this step, we will:

  • Understand the dataset
  • Check for missing values
  • Visualize distributions
  • Identify early indicators of fraudulent transactions
In [75]:
# Load libraries

!pip install plotly

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
import plotly.express as px

warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8')
sns.set_palette("coolwarm")
Requirement already satisfied: plotly in ./myenv/lib/python3.12/site-packages (6.3.1)
Requirement already satisfied: narwhals>=1.15.1 in ./myenv/lib/python3.12/site-packages (from plotly) (2.8.0)
Requirement already satisfied: packaging in ./myenv/lib/python3.12/site-packages (from plotly) (25.0)
In [57]:
# Load the Dataset - Adjust path if necessary

Credit = pd.read_csv("/mnt/c/1.MorganeCanada/Project-2-/Data/CreditCard_FraudDetection.csv")
df.head()
Out[57]:
Time V1 V2 V3 V4 V5 V6 V7 V8 V9 ... V22 V23 V24 V25 V26 V27 V28 Amount Class Hour
0 0.0 -1.359807 -0.072781 2.536347 1.378155 -0.338321 0.462388 0.239599 0.098698 0.363787 ... 0.277838 -0.110474 0.066928 0.128539 -0.189115 0.133558 -0.021053 149.62 0 0
1 0.0 1.191857 0.266151 0.166480 0.448154 0.060018 -0.082361 -0.078803 0.085102 -0.255425 ... -0.638672 0.101288 -0.339846 0.167170 0.125895 -0.008983 0.014724 2.69 0 0
2 1.0 -1.358354 -1.340163 1.773209 0.379780 -0.503198 1.800499 0.791461 0.247676 -1.514654 ... 0.771679 0.909412 -0.689281 -0.327642 -0.139097 -0.055353 -0.059752 378.66 0 0
3 1.0 -0.966272 -0.185226 1.792993 -0.863291 -0.010309 1.247203 0.237609 0.377436 -1.387024 ... 0.005274 -0.190321 -1.175575 0.647376 -0.221929 0.062723 0.061458 123.50 0 0
4 2.0 -1.158233 0.877737 1.548718 0.403034 -0.407193 0.095921 0.592941 -0.270533 0.817739 ... 0.798278 -0.137458 0.141267 -0.206010 0.502292 0.219422 0.215153 69.99 0 0

5 rows × 32 columns

In [58]:
# Basic Overview - We’ll inspect shape, column types, missing values, and a few summary statistics.

print("Shape:", Credit.shape)
print("\nInfo:")
print(Credit.info())
print("\nMissing values:", Credit.isnull().sum().sum())

Credit.describe().T.head(10)
Shape: (284807, 31)

Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 284807 entries, 0 to 284806
Data columns (total 31 columns):
 #   Column  Non-Null Count   Dtype  
---  ------  --------------   -----  
 0   Time    284807 non-null  float64
 1   V1      284807 non-null  float64
 2   V2      284807 non-null  float64
 3   V3      284807 non-null  float64
 4   V4      284807 non-null  float64
 5   V5      284807 non-null  float64
 6   V6      284807 non-null  float64
 7   V7      284807 non-null  float64
 8   V8      284807 non-null  float64
 9   V9      284807 non-null  float64
 10  V10     284807 non-null  float64
 11  V11     284807 non-null  float64
 12  V12     284807 non-null  float64
 13  V13     284807 non-null  float64
 14  V14     284807 non-null  float64
 15  V15     284807 non-null  float64
 16  V16     284807 non-null  float64
 17  V17     284807 non-null  float64
 18  V18     284807 non-null  float64
 19  V19     284807 non-null  float64
 20  V20     284807 non-null  float64
 21  V21     284807 non-null  float64
 22  V22     284807 non-null  float64
 23  V23     284807 non-null  float64
 24  V24     284807 non-null  float64
 25  V25     284807 non-null  float64
 26  V26     284807 non-null  float64
 27  V27     284807 non-null  float64
 28  V28     284807 non-null  float64
 29  Amount  284807 non-null  float64
 30  Class   284807 non-null  int64  
dtypes: float64(30), int64(1)
memory usage: 67.4 MB
None

Missing values: 0
Out[58]:
count mean std min 25% 50% 75% max
Time 284807.0 9.481386e+04 47488.145955 0.000000 54201.500000 84692.000000 139320.500000 172792.000000
V1 284807.0 1.759088e-12 1.958696 -56.407510 -0.920373 0.018109 1.315642 2.454930
V2 284807.0 -8.251210e-13 1.651309 -72.715728 -0.598550 0.065486 0.803724 22.057729
V3 284807.0 -9.655224e-13 1.516255 -48.325589 -0.890365 0.179846 1.027196 9.382558
V4 284807.0 8.321417e-13 1.415869 -5.683171 -0.848640 -0.019847 0.743341 16.875344
V5 284807.0 1.650335e-13 1.380247 -113.743307 -0.691597 -0.054336 0.611926 34.801666
V6 284807.0 4.248462e-13 1.332271 -26.160506 -0.768296 -0.274187 0.398565 73.301626
V7 284807.0 -3.054652e-13 1.237094 -43.557242 -0.554076 0.040103 0.570436 120.589494
V8 284807.0 8.777941e-14 1.194353 -73.216718 -0.208630 0.022358 0.327346 20.007208
V9 284807.0 -1.179734e-12 1.098632 -13.434066 -0.643098 -0.051429 0.597139 15.594995
In [77]:
# Target Variable Distribution - The dataset is highly imbalanced — this will influence our modeling strategy later.

fig = px.histogram(Credit, x='Class', color='Class',
                   color_discrete_map={0: "skyblue", 1: "red"},
                   title="Fraud (1) vs Non-Fraud (0)",
                   text_auto=True)

# Show percentage in annotation (optional)

fraud_ratio = df['Class'].value_counts(normalize=True)[1] * 100
fig.update_layout(
    annotations=[dict(
        x=0.5,
        y=1.05,
        xref='paper',
        yref='paper',
        text=f"Fraudulent transactions represent only {fraud_ratio:.3f}% of total data",
        showarrow=False,
        font=dict(size=14))])
fig.show(renderer="notebook_connected")
In [60]:
# Transaction Amount Distribution

plt.figure(figsize=(8,5))
sns.histplot(df['Amount'], bins=100, kde=True)
plt.title("Distribution of Transaction Amounts")
plt.xlabel("Transaction Amount")
plt.show()
No description has been provided for this image
In [78]:
# Temporal Analysis - The Time variable represents seconds elapsed since the first transaction. We'll create an Hour feature to see if fraud clusters at specific times.

Credit['Hour'] = ((Credit['Time'] // 3600) % 24).astype(int)

# Create interactive histogram
fig = px.histogram(
    Credit, 
    x='Hour', 
    color='Class',
    barmode='group',  # side-by-side bars
    color_discrete_map={0: "skyblue", 1: "red"},
    title="Fraud Frequency by Hour of Day",
    labels={'Class': 'Transaction Class', 'Hour': 'Hour of Day'},
    text_auto=True)

fig.update_layout(
    xaxis=dict(dtick=1),  # show every hour tick
    yaxis_title="Count",
    legend_title="Class")

fig.show(renderer="notebook_connected")
In [62]:
# Correlation Analysis 

hourly_fraud_corr = Credit.groupby('Hour').apply(lambda x: x['Amount'].corr(x['Class'])).reset_index(name='Correlation')

# Create a color list: red if correlation > 0, blue if < 0

colors = ['#FF6B6B' if val > 0 else '#4D96FF' for val in hourly_fraud_corr['Correlation']]

plt.figure(figsize=(10,5))
sns.barplot(x='Hour', y='Correlation', data=hourly_fraud_corr, palette=colors)

plt.title("Correlation of Amount with Fraud by Hour")
plt.ylabel("Correlation with Fraud (Class)")
plt.xlabel("Hour of Day")
plt.show()
No description has been provided for this image
In [79]:
# Interactive Visualization (Plotly)

import plotly.express as px
import plotly.io as pio

pio.renderers.default = "notebook"  # or "notebook_connected" / "jupyterlab"

fig = px.histogram(df, x="Amount", color="Class", nbins=60,
                   barmode="overlay", title="Transaction Amount by Fraud Status",
                   color_discrete_map={'0':'blue','1':'red'})
fig.show(renderer="notebook_connected")

Key Insights

  • The dataset contains highly imbalanced classes (~0.17% fraud).
  • Fraudulent transactions tend to occur more often at certain hours (check correlation plot).
  • Scaling and class balancing will be essential for accurate modeling.

Next step

  • Data cleaning,
  • Feature engineering
  • first ML models.
In [64]:
# Load datasets

credit = pd.read_csv("/mnt/c/1.MorganeCanada/Project-2-/Data/CreditCard_FraudDetection.csv")

print("CREDIT CARD DATA")
print("Shape:", credit.shape)
print(credit.head())
print("\nColumns:", credit.columns)

print("\n" + "="*60 + "\n")
CREDIT CARD DATA
Shape: (284807, 31)
   Time        V1        V2        V3        V4        V5        V6        V7  \
0   0.0 -1.359807 -0.072781  2.536347  1.378155 -0.338321  0.462388  0.239599   
1   0.0  1.191857  0.266151  0.166480  0.448154  0.060018 -0.082361 -0.078803   
2   1.0 -1.358354 -1.340163  1.773209  0.379780 -0.503198  1.800499  0.791461   
3   1.0 -0.966272 -0.185226  1.792993 -0.863291 -0.010309  1.247203  0.237609   
4   2.0 -1.158233  0.877737  1.548718  0.403034 -0.407193  0.095921  0.592941   

         V8        V9  ...       V21       V22       V23       V24       V25  \
0  0.098698  0.363787  ... -0.018307  0.277838 -0.110474  0.066928  0.128539   
1  0.085102 -0.255425  ... -0.225775 -0.638672  0.101288 -0.339846  0.167170   
2  0.247676 -1.514654  ...  0.247998  0.771679  0.909412 -0.689281 -0.327642   
3  0.377436 -1.387024  ... -0.108300  0.005274 -0.190321 -1.175575  0.647376   
4 -0.270533  0.817739  ... -0.009431  0.798278 -0.137458  0.141267 -0.206010   

        V26       V27       V28  Amount  Class  
0 -0.189115  0.133558 -0.021053  149.62      0  
1  0.125895 -0.008983  0.014724    2.69      0  
2 -0.139097 -0.055353 -0.059752  378.66      0  
3 -0.221929  0.062723  0.061458  123.50      0  
4  0.502292  0.219422  0.215153   69.99      0  

[5 rows x 31 columns]

Columns: Index(['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10',
       'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20',
       'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount',
       'Class'],
      dtype='object')

============================================================

FEATURE ENGINEERING FOR CREDIT CARD FRAUD DATA¶

In [65]:
# Convert time in seconds to hours (0–23)
    
credit['hour'] = (credit['Time'] // 3600) % 24

# Night transactions between 10 PM and 6 AM

credit['is_night'] = credit['hour'].apply(lambda x: 1 if (x >= 22 or x <= 6) else 0)

# Log transform of Amount to reduce skew

credit['amount_log'] = np.log1p(credit['Amount'])
In [66]:
# Check for missing values and Balance

print("\nMissing values in Credit Card:")
print(credit.isna().sum().sum())

print("\nClass balance in Credit Card:")
print(credit["Class"].value_counts())
Missing values in Credit Card:
0

Class balance in Credit Card:
Class
0    284315
1       492
Name: count, dtype: int64
In [67]:
# Check the new data set

print("CREDIT CARD DATA")
print("Shape:", credit.shape)
print(credit.head())
print("\nColumns:", credit.columns)

print("\n" + "="*60 + "\n")

print("Hour distribution:")
print(credit['hour'].value_counts().sort_index())

print("\nNight transactions count:")
print(credit['is_night'].value_counts())
CREDIT CARD DATA
Shape: (284807, 34)
   Time        V1        V2        V3        V4        V5        V6        V7  \
0   0.0 -1.359807 -0.072781  2.536347  1.378155 -0.338321  0.462388  0.239599   
1   0.0  1.191857  0.266151  0.166480  0.448154  0.060018 -0.082361 -0.078803   
2   1.0 -1.358354 -1.340163  1.773209  0.379780 -0.503198  1.800499  0.791461   
3   1.0 -0.966272 -0.185226  1.792993 -0.863291 -0.010309  1.247203  0.237609   
4   2.0 -1.158233  0.877737  1.548718  0.403034 -0.407193  0.095921  0.592941   

         V8        V9  ...       V24       V25       V26       V27       V28  \
0  0.098698  0.363787  ...  0.066928  0.128539 -0.189115  0.133558 -0.021053   
1  0.085102 -0.255425  ... -0.339846  0.167170  0.125895 -0.008983  0.014724   
2  0.247676 -1.514654  ... -0.689281 -0.327642 -0.139097 -0.055353 -0.059752   
3  0.377436 -1.387024  ... -1.175575  0.647376 -0.221929  0.062723  0.061458   
4 -0.270533  0.817739  ...  0.141267 -0.206010  0.502292  0.219422  0.215153   

   Amount  Class  hour  is_night  amount_log  
0  149.62      0   0.0         1    5.014760  
1    2.69      0   0.0         1    1.305626  
2  378.66      0   0.0         1    5.939276  
3  123.50      0   0.0         1    4.824306  
4   69.99      0   0.0         1    4.262539  

[5 rows x 34 columns]

Columns: Index(['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10',
       'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20',
       'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount',
       'Class', 'hour', 'is_night', 'amount_log'],
      dtype='object')

============================================================

Hour distribution:
hour
0.0      7695
1.0      4220
2.0      3328
3.0      3492
4.0      2209
5.0      2990
6.0      4101
7.0      7243
8.0     10276
9.0     15838
10.0    16598
11.0    16856
12.0    15420
13.0    15365
14.0    16570
15.0    16461
16.0    16453
17.0    16166
18.0    17039
19.0    15649
20.0    16756
21.0    17703
22.0    15441
23.0    10938
Name: count, dtype: int64

Night transactions count:
is_night
0    230393
1     54414
Name: count, dtype: int64

Are there more frauds during the night?

In [68]:
# Total amount of frauds.

fraud_df = credit[credit["Class"] == 1]
print("Total fraudulent transactions:", len(fraud_df))

# At night?

fraud_night_counts = fraud_df["is_night"].value_counts()
print("Fraud count by night/day:")
print(fraud_night_counts)

fraud_night_percent = fraud_df["is_night"].value_counts(normalize=True) * 100
print("\nFraud percentage by night/day:")
print(fraud_night_percent)

# Particular Hour?

fraud_by_hour = fraud_df.groupby("hour").size()
print(fraud_by_hour)
Total fraudulent transactions: 492
Fraud count by night/day:
is_night
0    329
1    163
Name: count, dtype: int64

Fraud percentage by night/day:
is_night
0    66.869919
1    33.130081
Name: proportion, dtype: float64
hour
0.0      6
1.0     10
2.0     57
3.0     17
4.0     23
5.0     11
6.0      9
7.0     23
8.0      9
9.0     16
10.0     8
11.0    53
12.0    17
13.0    17
14.0    23
15.0    26
16.0    22
17.0    29
18.0    33
19.0    19
20.0    18
21.0    16
22.0     9
23.0    21
dtype: int64

Prediction with Machine Learning, BUT first balance and scale the dataset¶

In [69]:
# Separate features & target

X = credit.drop("Class", axis=1)
y = credit["Class"]

# Train/test split (ALWAYS before balancing)

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    stratify=y, # keeps same fraud ratio in test set
    random_state=42
)

# Scale the data !!! Fit only on train, then transform both.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Balance the training set (SMOTE – best for fraud)

!pip install imbalanced-learn

from imblearn.over_sampling import SMOTE

smote = SMOTE(random_state=42)
X_train_balanced, y_train_balanced = smote.fit_resample(X_train_scaled, y_train)

print("Before balancing:", y_train.value_counts())
print("After balancing:", pd.Series(y_train_balanced).value_counts())
Requirement already satisfied: imbalanced-learn in ./myenv/lib/python3.12/site-packages (0.14.0)
Requirement already satisfied: numpy<3,>=1.25.2 in ./myenv/lib/python3.12/site-packages (from imbalanced-learn) (2.3.0)
Requirement already satisfied: scipy<2,>=1.11.4 in ./myenv/lib/python3.12/site-packages (from imbalanced-learn) (1.15.3)
Requirement already satisfied: scikit-learn<2,>=1.4.2 in ./myenv/lib/python3.12/site-packages (from imbalanced-learn) (1.7.0)
Requirement already satisfied: joblib<2,>=1.2.0 in ./myenv/lib/python3.12/site-packages (from imbalanced-learn) (1.5.1)
Requirement already satisfied: threadpoolctl<4,>=2.0.0 in ./myenv/lib/python3.12/site-packages (from imbalanced-learn) (3.6.0)
Before balancing: Class
0    227451
1       394
Name: count, dtype: int64
After balancing: Class
0    227451
1    227451
Name: count, dtype: int64

Logistic Linear Regression

In [70]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

model = LogisticRegression(max_iter=1000)
model.fit(X_train_balanced, y_train_balanced)

y_pred = model.predict(X_test_scaled)

print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       1.00      0.97      0.99     56864
           1       0.06      0.92      0.10        98

    accuracy                           0.97     56962
   macro avg       0.53      0.95      0.55     56962
weighted avg       1.00      0.97      0.98     56962

Metric Value Meaning

Recall = 0.92 - Model catches 92% of all frauds (very good) Precision = 0.05 - Only 5% of fraud predictions are actually fraud F1-score = 0.10 Poor - Because precision is extremely low

My model is acting like this: “If I think it might be fraud, I’ll just call it fraud.”

So it catches most frauds. BUT it also flags huge numbers of normal transactions as fraud.

This would: Annoy customers Block legit cards Cause false alarms

Because I used SMOTE or balancing, the model became very sensitive to fraud examples but cannot differentiate well enough yet.

In [71]:
# Precision recall curve

from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

precision, recall, th = precision_recall_curve(y_test, y_probs)

plt.figure()
plt.plot(th, precision[:-1], label="Precision")
plt.plot(th, recall[:-1], label="Recall")

plt.xlabel("Threshold")
plt.ylabel("Score")
plt.title("Precision & Recall vs Threshold")
plt.legend()
plt.show()
No description has been provided for this image
In [72]:
Best_threshold = 0.90

y_pred_tuned = (y_probs >= best_threshold).astype(int)

from sklearn.metrics import classification_report

print(classification_report(y_test, y_pred_tuned))
              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.93      0.84      0.88        98

    accuracy                           1.00     56962
   macro avg       0.97      0.92      0.94     56962
weighted avg       1.00      1.00      1.00     56962

The model is doing ok but not great

Linear models are limited when classes overlap heavily or the minority class is very sparse.

Prediction Model with Random Forest

In [73]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, precision_recall_curve, f1_score
from imblearn.over_sampling import SMOTE

X_train_res, y_train_res = X_train, y_train  # Replace if SMOTE needed

# Initialize Random Forest with balanced class weights

rf = RandomForestClassifier(
    n_estimators=500,        # more trees improve stability
    max_depth=None,          # let trees grow fully, prevents underfitting
    min_samples_split=5,     # avoids overfitting to tiny nodes
    min_samples_leaf=2,      # ensures each leaf has enough samples
    max_features='sqrt',     # reduces correlation among trees
    class_weight='balanced', # handles imbalance automatically
    random_state=42,
    n_jobs=-1
)

# Train the model
rf.fit(X_train_res, y_train_res)

# Instead of default 0.5, pick the threshold that maximizes F1-score for class 1:

from sklearn.metrics import precision_recall_curve, classification_report

# Predict probabilities for the positive class
y_probs = rf.predict_proba(X_test)[:, 1]

# Calculate precision, recall, thresholds
precisions, recalls, thresholds = precision_recall_curve(y_test, y_probs)

# Compute F1 for each threshold
f1_scores = 2 * precisions * recalls / (precisions + recalls + 1e-8)
best_idx = f1_scores.argmax()
best_threshold = thresholds[best_idx]

print("Best threshold for max F1:", best_threshold)

# Apply threshold
y_pred_tuned = (y_probs >= best_threshold).astype(int)

# Evaluate
print(classification_report(y_test, y_pred_tuned))

# Hyperparameter tunning
param_grid = {
    'n_estimators': [300, 500, 700],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['sqrt', 'log2', None]
}
Best threshold for max F1: 0.4197894472112458
              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.94      0.82      0.87        98

    accuracy                           1.00     56962
   macro avg       0.97      0.91      0.94     56962
weighted avg       1.00      1.00      1.00     56962

Interpretation:

  • Class 1 (minority class) is now predicted very accurately. F1-score jumped from 0.30 → 0.88.
  • Precision is very high → very few false positives.
  • Recall is still strong → most true positives are captured.

Key Takeaways:

  • Random Forest + SMOTE + threshold tuning worked perfectly.
  • F1 optimization for class 1 is the right approach for imbalanced datasets.
  • Threshold tuning can make a huge difference, especially for rare classes.
In [74]:
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

# Define zoom range around best threshold

best_threshold = 0.4097
zoom_margin = 0.1
thresholds_zoom = np.linspace(best_threshold - zoom_margin, best_threshold + zoom_margin, 100)

# Compute precision, recall, F1 for each threshold

precisions = []
recalls = []
f1_scores = []

for t in thresholds_zoom:
    y_pred = (y_probs >= t).astype(int)
    precisions.append(precision_score(y_test, y_pred))
    recalls.append(recall_score(y_test, y_pred))
    f1_scores.append(f1_score(y_test, y_pred))

# Plot
plt.figure(figsize=(10,6))
plt.plot(thresholds_zoom, precisions, label='Precision', color='blue')
plt.plot(thresholds_zoom, recalls, label='Recall', color='green')
plt.plot(thresholds_zoom, f1_scores, label='F1-score', color='red')
plt.axvline(x=best_threshold, color='black', linestyle='--', label='Best F1 Threshold')
plt.xlabel('Threshold')
plt.ylabel('Score')
plt.title('Precision, Recall, F1 vs Threshold (Fast Zoomed In)')
plt.legend()
plt.grid(True)
plt.show()
No description has been provided for this image